Chapter 9 MINING TEXT STREAMS
Abstract
The large amounts of text data that are continuously produced over time in a variety of large-scale applications, such as social networks, result in massive streams of data. Typically, massive text streams are created by very large-scale interactions of individuals, or by the structured creation of particular kinds of content by dedicated organizations. An example in the latter category is the massive text streams created by news-wire services. Such text streams pose unprecedented challenges to data mining algorithms from an efficiency perspective. In this chapter, we review text stream mining algorithms for a wide variety of problems in data mining, such as clustering, classification, and topic modeling. We also discuss a number of future challenges in this area of...
Similar resources
Research on Online Topic Evolutionary Pattern Mining in Text Streams
Text streams are a class of ubiquitous data that arrive over time and are so extraordinarily large in scale that we often lose track of them. Text streams form a fundamental source of information that can be used to detect semantic topics that individuals and organizations are interested in, as well as burst events within communities. Thus, an intelligent system that can automatically e...
A Semantic Graph-Based Approach for Mining Common Topics from Multiple Asynchronous Text Streams
In the age of Web 2.0, a substantial amount of unstructured content is distributed through multiple text streams in an asynchronous fashion, which makes it increasingly difficult to glean and distill useful information. An effective way to explore the information in text streams is topic modelling, which can further facilitate other applications such as search, information browsing, and patter...
Bursty Feature Representation for Clustering Text Streams
Text representation plays a crucial role in classical text mining, where the primary focus was on static text. Nevertheless, well-studied static text representations including TFIDF are not optimized for non-stationary streams of information such as news, discussion board messages, and blogs. We therefore introduce a new temporal representation for text streams based on bursty features. Our bur...
A Novel Method to Intelligently Mine Social Media to Assess Consumer Sentiment of Pharmaceutical Drugs
... Acknowledgements ... Abbreviations ...
Text mining for systems biology and MetNet
... Chapter 1. Background: text mining of biological literature for interaction extraction ... 1.1 Review of interaction extraction methods ...
Publication date: 2012